The main dataset used in this tutorial is about the geolocalisation of french restaurants in Paris. Data is extracted from an official register called SIRENE (Computer system for the business and establishment register) managed by the French National Institute of Statistics and Economic Studies (Insee) and geolocated by Etalab (French task force for Open Data). This register records the civil status of all companies and their establishments (including restaurants). SIRENE has the advantages of being rigorous and exhaustive on the French territory.

Exercise 1 : Manipulate sf objects and associated data.frames

1

Import the iris1 map layer ‘iris_75.shp’ of Paris.
Use sf::st_read().
Reading layer `iris_75' from data source `/home/tim/Documents/prz/satRday/exercises/data/iris_75.shp' using driver `ESRI Shapefile'
Simple feature collection with 992 features and 4 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 643075.6 ymin: 6857477 xmax: 661086.2 ymax: 6867081
epsg (SRID):    NA
proj4string:    +proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +units=m +no_defs

2

Display the basemap of Paris with plot(iris_75). What do you notice ?
We notice that R performs 4 graphs: one graph per variable in the sf object.

3

What is the functionality of the sf::st_geometry() function? What solution do you propose then?
sf::st_geometry() makes it possible to isolate the information contained in the ‘geometry’ column of the sf object. Using it, we put aside other variables (here CODE_IRIS, P14_POP, AREA and CODE_COM).

4

Import the restaurant layer ‘sir.shp’ and display a map of Paris with its restaurants.
Use st_read() and sf::st_geometry().
Reading layer `sir_75' from data source `/home/tim/Documents/prz/satRday/exercises/data/sir_75.shp' using driver `ESRI Shapefile'
Simple feature collection with 23348 features and 7 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: 643502.1 ymin: 6857645 xmax: 659766.4 ymax: 6867041
epsg (SRID):    NA
proj4string:    +proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +units=m +no_defs

5

Count the number of restaurant by iris.
Use sf::st_intersects() and sapply().
Simple feature collection with 6 features and 5 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 650181.1 ymin: 6861761 xmax: 652179 ymax: 6863138
epsg (SRID):    NA
proj4string:    +proj=lcc +lat_1=49 +lat_2=44 +lat_0=46.5 +lon_0=3 +x_0=700000 +y_0=6600000 +ellps=GRS80 +units=m +no_defs
  CODE_IRIS P14_POP      AREA CODE_COM                       geometry RESTAU
1 751010101     974  64006.07    75101 MULTIPOLYGON (((652096.5 68...     44
2 751010102     167  62671.06    75101 MULTIPOLYGON (((652052.7 68...      6
3 751010103     246  47653.41    75101 MULTIPOLYGON (((651869 6862...     16
4 751010104       3 222656.99    75101 MULTIPOLYGON (((651639.9 68...     10
5 751010105       0 243255.23    75101 MULTIPOLYGON (((650369.8 68...      0
6 751010199       0 234277.01    75101 MULTIPOLYGON (((652096.5 68...      0

6

Using the layer called ‘iris_75’, create a new aggregated map layer called ‘com_75’ which corresponds to the Paris Arrondissements. Also keep in this new layer the information on the population, area and number of restaurant in each municipality.

Information

The map layer called ‘iris_75’ contains the 5 digit codes of arrondissemnt in its variable CODE_COM.
Use the classic functions of dplyr package: select, group_by et summarize. These functions also work with sf objects.

Exercise 2 : Static maps

We would like to design a map of Paris arrondissments that combines the number of restaurants and the number of restaurants per 10,000 inhabitants.

1

Data preparation:

  • Load the layer called ‘com75_shp’ (which contains the population and the number of restaurants in each arrondissement) and create a variable called rest_per_10k which corresponds to the number of restaurants per 10,000 inhabitants in each territory.
  • Create a vector of quantiles breaks of the rest_per_10k variable.
  • Create the vector colors which corresponds to a the number of classes defined earlier.
  • For ggplot2 maps, add a variable called typo to ‘com_75’ which indicates the class of the territory according to the discretization contained in bks for the rest_per_10k variable.

Information

For the creation of ‘bks’ et ‘cols’, use the getBreaks et carto.pal functions of the cartography package. For the creation of the typo variable, you can use the cut function and apply the parameters digit.lab = 2 and include.lowest = TRUE.

2

With the help of cartography package, make the following map which contains in a choropleth layer the variable rest_per_10k and in a proportional circle layer the variable RESTAU. You can also try to do the same map using ggplot2 and tmap packages.


  1. In French, IRIS is an acronym of ‘aggregated units for statistical information’. Their target sizes are 2000 residents per basic unit.